ABSTRACT


This  report  describes  the  development  of  a procedure  for  evaluating 
research  through  the  report  as  part  of  a comprehensive  research  program  for  the 
study  of  potentiality  for  and  successful  performance  in  research  work.  The  program 
is  sponsored  by  the  Personnel  and  Training  Branch,  Psychological  Sciences  Division, 
Office  of  Naval  Research.  The  total  program  of  research  includes  the  following 
steps: 

1.  Determination  of  the  critical  requirements  for  successful  participation 
in  research  and  engineering  work 

2.  Development  of  an  aptitude  test  for  the  selection  of  research  personnel 

3.  Development  of  tests  to  measure  proficiency  in  specific  areas  of 
scientific  work 

4.  Development  of  procedures  for  evaluating  the  job  performances  of  research 
personnel 

5.  Determination  of  the  predictive  value  of  the  tests  developed  in  steps  (2) 
and (3) using the procedures developed in step (4) to obtain evaluations of
personnel  for  comparison  with  test  predictions. 

The  first  three  of  the  above  steps  have  been  completed.  Preliminary  work 
on  step  4 has  previously  been  reported.  The  present  report  deals  with  additional 
work  on  that  step. 

A preliminary  version  of  the  Record  Form  for  Evaluating  Research  Through 
the  Report  was  constructed  directly  from  the  first  five  areas  of  the  Observational 
Record  Form  for  Research  Personnel  which  was  developed  in  a previous  study.  The 
areas  included  on  the  form  were: 

I.  Formulating  Problems  and  Hypotheses 

II.  Planning  and  Designing  the  Investigation 

III.  Conducting  the  Investigation 

IV. Interpreting Research Results

V.  Preparing  Reports 

The  preliminary  form  and  instructions  for  its  use  were  revised  on  the 
basis  of  results  from  several  small  scale  tryouts. 

Chemists,  physicists,  and  engineers  who  were  listed  in  American  Men  of 
Science as fellows, officers, or past officers of their respective professional
societies  were  contacted  by  mail  and  each  requested  to  select  one  report  of 
especially  effective  and  one  report  of  relatively  mediocre  research.  The  selected 
reports  were  then  evaluated  by  other  workers  in  the  field  using  a trial  Record  Form 
composed of 114 items descriptive of effective and 114 items descriptive of ineffective
research performance. The trial form was revised on the basis of results
from  the  evaluations  and  the  comments  of  evaluators  concerning  the  form.  Fifty 
items  describing  effective  performance  were  selected  for  the  revised  form.  No 
ineffective  items  were  selected  since,  as  a group,  they  were  less  used  and  did 
not  seem  to  function  as  well  as  the  effective  items. 

The  method  of  developing  the  revised  Record  Form  for  Evaluating  Research 
Through  the  Report  should  maximize  its  value  as  an  aid  to  careful  evaluation. 
However,  its  usefulness  for  evaluation  has  not  been  fully  determined.  A study  for 
this  purpose  is  recommended. 

The estimate of inter-evaluator reliability obtained in this study for the
50 items selected for the revised form was low. The estimates of intra-evaluator
reliability were rather high. Although the estimates may be somewhat in error
due to the effect of selecting the items from a large number of other items, the
evidence obtained supports a conclusion that standards for evaluating research
through the report, using items describing effective research performance, vary
greatly from one evaluator to another. It is therefore recommended that item
responses not be combined into a total score except in certain special circumstances.
Individual items should be used by the evaluator as an aid to arriving
at a careful over-all judgment about the value of the research. It is suggested that
methods for reducing the variation among evaluators' standards may be a fruitful
area for further research.
Chapter I


THE  PROBLEM 
Introduction 


The impact upon modern society of research and the many complex technologies
to which it is wedded hardly needs mention. We are continually reminded
by  the  daily  press,  popular  magazines,  radio  and  television  that  research  and 
technics  are  playing  an  increasingly  important  role  in  our  daily  lives.  There  are 
many  who  decry  the  effect  of  scientific  advance  upon  our  way  of  life  and  there  are 
perhaps  still  more  who  laud  it  as  contributor  and  defender.  But  no  one  detracts 
from  its  importance. 

Effective communication among research workers is an essential and time-honored
aspect of research. For without effective communication each research
worker must spend so much time unearthing findings which have already been
developed elsewhere that scientific progress is greatly impeded. There are many
channels through which research workers communicate, including professional
meetings, postgraduate seminars, and the publication of basic reference books,
but it seems safe to say that the single most important mode for communicating
research  knowledge  is  publication  in  professional  journals.  It  is  therefore  surprising 
that  the  effectiveness  of  this  communication,  that  is,  the  amount  of  information 
conveyed  to  different  readers  by  scientific  reports,  has  not  been  investigated.  A 
survey  of  the  literature*  concerning  scientific  manpower  revealed  no  systematic 
studies  of  this  problem. 

The main purpose of the present investigation was to develop a new instrument
which might be useful to persons responsible for evaluating research through
its report. However, much of the information obtained in this study should be
relevant  to  an  assessment  of  the  agreement  among  research  workers  concerning 
research  evaluated  through  the  report.  Such  an  assessment  should  provide  valuable 
knowledge  concerning  the  effectiveness  with  which  written  reports  communicate  to 
research  workers.  For  an  important  function  of  scientific  report  writing  is  to 
communicate  to  the  qualified  reader  the  contribution  which  has  been  made  by  the 
reported  research.  Unless  there  is  agreement  among  qualified  readers  concerning 
this contribution, it is apparent that scientific communication is not functioning
optimally. Many of the data gathered in the subject research project are relevant to
this problem. There is a real need for procedures and instruments which will help
to  increase  the  effectiveness  of  research  performance.  There  seems  to  be  little 
likelihood  that  the  demands  for  results  of  governmental  research  operations  will  be 
reduced  in  the  foreseeable  future.  Rather,  the  need  for  new  research  information 
seems  to  be  steadily  increasing.  On  the  other  hand,  it  appears  that  governmental 


* Weislogel,  M.  H.  and  Altman,  J.  W.  Abstracts  of  Literature  Concerning 
Scientific Manpower. Pittsburgh: American Institute for Research, 1952.
laboratories are reaching the saturation point in the number of professional research
workers they will be able to absorb. As the numbers of professional
workers  begin  to  level  off,  while  the  demand  for  results  continues  to  mount,  there 
are  two  major  ways  of  preventing  a widening  of  the  breach  between  the  Government's 
imperative need for effective research and the ability of the laboratories to produce
research.

The  first  way  to  increased  research  effectiveness  is  through  careful 
evaluation  and  development  of  currently  employed  research  workers.  This  includes 
matching  the  right  man  with  the  right  job,  letting  the  man  know  his  own  strengths 
and weaknesses in an objective and constructive manner, and rewarding outstandingly
effective performance with recognition and increased responsibility. The
misplaced or unrecognized worker is likely to seek new employment, with a resulting
waste of the time and money that went into recruiting, orienting, and training
him. Even if he remains on the job, his achievement will fall below capacity.

A second  way  to  improved  productivity  in  research  is  through  careful 
selection and placement of new employees and selection of the most promising candidates
for advanced training. Performance on standardized tests and records of
past  research  performance  are  useful  sources  of  information.  School  grades  are 
only  a special,  and  relatively  indirect,  index  of  past  research  performance  which 
may  best  be  used  in  conjunction  with  scores  on  carefully  constructed  standardized 
tests  and  more  direct  measures  of  previous  research  performance.  A careful 
appraisal  of  this  information  should  improve  identification  of  applicants  with 
superior  abilities  and  of  those  who  are  not  qualified  for  professional  research  work. 
It  should  also  provide  an  estimate  of  the  level  at  which  applicants  should  begin  work 
and the level they should ultimately attain. This would improve research performance
in at least three ways: 1) it would start workers with superior talents on
creative  research  early  in  their  career;  2)  it  would  place  satisfactory,  but  not 
outstanding,  applicants  at  a job  level  where  work  would  be  sufficiently  difficult  to 
present  a challenge  but  not  so  difficult  as  to  be  discouraging;  and  3)  it  would  reject 
employees  lacking  sufficient  potentiality  before  considerable  money  and  effort  were 
expended  on  them. 


A Long-range  Personnel  Research  Program 

Since January 1948 the Office of Naval Research has sponsored a comprehensive
program of the American Institute for Research dealing with personnel
research problems of scientific and engineering workers. The reports of work
already accomplished have been published.* The total program includes the
following steps:

1.  Determination  of  the  critical  requirements  for  successful  participation 
in  research  and  engineering  work 


* See  the  bibliography  of  these  reports  on  page  40. 


2.  Development  of  an  aptitude  test  for  the  selection  of  research  personnel 

3.  Development  of  tests  to  measure  proficiency  in  specific  areas  of  scientific 
work 

4.  Development  of  procedures  for  evaluating  the  job  performances  of  research 
personnel 

5.  Determination  of  the  predictive  value  of  the  tests  developed  in  steps  (2) 
and  (3)  using  the  procedures  developed  in  step  (4)  to  obtain  evaluations  of 
personnel  for  comparison  with  test  predictions. 

The  first  three  of  the  above  steps  have  been  completed.  Preliminary  work 
on  step  4 has  previously  been  reported.  The  present  report  deals  with  additional 
work  on  that  step.  A project  is  currently  in  progress  which  deals  with  step  5. 


Functions  of  a Procedure  for  Evaluating 
Research  Through  the  Report 

A procedure  for  evaluating  research  through  the  report  functions  both  in 
selection  of  new  employees  and  in  evaluating  the  performance  of  currently  employed 
workers.  It  functions  at  a number  of  levels.  Such  a procedure  might  conveniently 
be  used  to  evaluate  theses  or  dissertations  of  prospective  employees  who  are  recently 
graduated  from  school.  At  a higher  level  it  could  be  used  to  evaluate  the  published 
research  reports  of  more  experienced  personnel.  It  might  also  be  used  to  evaluate 
the  performance  of  personnel  engaged  in  more  or  less  independent  research  since 
the  performance  of  such  persons  is  frequently  little  known  except  through  the 
written  report. 

There  are  at  least  four  advantages  to  evaluation  of  research  through  the 
report.  The  report: 

1.  is  a permanent  record  of  performance; 

2.  can  be  made  available  to  several  judges  or  raters; 

3. can be evaluated by persons who do not know the author, thus decreasing
personal  bias; 

4.  can  be  evaluated  in  a single  and  relatively  brief  period,  rather  than 
requiring  a series  of  observations  of  job  performance. 


Objectives  of  the  Present  Project 

The  general  objective  of  the  project  was  to  develop  a procedure  which 
would be helpful to graduate and undergraduate thesis committees, research supervisors,
potential employers, and others responsible for evaluating research through
the report. Specific steps of the project include the following:

1. Developing pre-field test forms of the procedure for use in preliminary tryouts

2. Revising the preliminary forms to develop a trial form for a large scale field test

3. Obtaining an independent measure of the effectiveness of the reports to
be evaluated in the field test

4. Conducting a large scale field test to provide data suitable for revising
the trial form of the procedure

5. Revising the trial form on the basis of statistical data and comments of
evaluators obtained in the field test.
Chapter II


PLANNING  THE  RESEARCH 


Formulating  the  Problem 

In  the  spring  of  1951  Dr.  Paul  Horst  of  the  University  of  Washington  was 
asked  to  study  the  problem  of  evaluating  research  through  the  report  and  to  recommend 
a plan  for  developing  a suitable  evaluation  form.  His  recommendations,  in  part, 
were: 

1.  Construct  a check  list  of  items  describing  critical  behaviors  which  can  be 
evaluated  from  the  report.  Horst  pointed  out  that  since  the  Observational 
Record Form for Research Personnel,* developed from a study of critical
requirements for research workers,** is believed to represent the most
accurate and comprehensive list of critical behaviors involved in success
or failure of research personnel which is currently available, it should
provide the basis for an experimental report evaluation form. He further
suggested  that  the  experimental  form  include  only  the  first  five  main  areas 
of  the  Observational  Record  Form,  since  the  last  three  areas  include 
critical  behaviors  which  would  probably  not  be  reported  in  the  research 
report.  The  first  five  areas,  on  the  other  hand,  include  behaviors  which 
might  be  given  in  a research  report.  These  five  areas  are  listed  below: 

I.  Formulating  Problems  and  Hypotheses 

II.  Planning  and  Designing  the  Investigation 

III.  Conducting  the  Investigation 

IV.  Interpreting  Research  Results 

V.  Preparing  Reports 

2.  Have  specialists  from  a variety  of  natural  science  and  engineering  fields 
and  research  functions  select  reports  which  are  clearly  examples  of  either 
especially  effective  or  relatively  mediocre  research. 

3.  Have  persons  specializing  in  the  appropriate  field  evaluate  the  research  by 
indicating  for  each  item  on  the  experimental  check  list  whether  or  not  there 
is  evidence  in  the  report  that  the  given  critical  behavior  occurred. 


* Weislogel,  Mary  H.  Procedures  for  Evaluating  Research  Personnel  with  a 
Performance  Record  of  Critical  Incidents.  Pittsburgh:  American  Institute 
for  Research,  1950. 

**  Flanagan,  John  C.  et  al.  Critical  Requirements  for  Research  Personnel. 
Pittsburgh:  American  Institute  for  Research,  1949. 


4.  Revise  the  experimental  check  list  on  the  basis  of  statistical  analysis  of 
the  item  responses  and  other  information  provided  by  participants  in  the 
study. 


Preliminary  Tryout  of  the  Experimental  Check  List 

A preliminary  Record  Form  for  Evaluating  Research  Through  the  Report 
was  made  up  directly  from  the  Observational  Record  Form  for  Research  Personnel. 
Thirteen individuals participated in the preliminary tryout of this form. Participants
selected reports from their own field of specialization as effective or
ineffective.  Eight  persons  evaluated  reports  judged  effective,  and  five  evaluated 
reports  judged  ineffective.  Four  of  those  who  evaluated  effective  reports  were 
psychologists,  two  were  chemists,  and  two  were  physicists.  Three  of  those  who 
evaluated  ineffective  reports  were  psychologists  and  two  were  physicists.  Data 
obtained in the tryout were of two kinds: entries on the Record Form and comments
of  the  respondents.  A study  of  these  data  indicated: 

1.  the  preliminary  form  was  sufficiently  comprehensive;  no  additional  items 
needed  to  be  added  to  the  form. 

2. where an effective item was not balanced by an analogous ineffective item,
or an ineffective item was not balanced by an analogous effective item,
participants found evaluation difficult.

3.  certain  changes  in  the  instructions  for  use  of  the  procedure  were  needed. 

It  was  also  decided  that  the  term  "ineffective"  was  probably  too  strong 
for  the  less  satisfactory  research.  The  term  "relatively  mediocre" 
seemed  more  appropriate  for  reports  to  be  selected  in  later  stages  of 
the  project. 

After  several  revisions  and  a tryout  with  six  research  psychologists  the 
preliminary  evaluation  form  was  used  by  two  physicists,  two  chemists,  and  two 
engineers. Interviews with these persons indicated that the preliminary form was
satisfactory  for  large  scale  tryout  but  that  additional  changes  in  the  instructions  for 
use  of  the  form  were  needed  in  the  interest  of  brevity  and  clarity.  The  required 
changes  were  made  in  the  instructions  and  they  were  judged  to  be  satisfactory  for 
large  scale  tryout. 


Obtaining  Reports  for  Evaluation 

Several  sources  for  reports  of  research  judged  to  be  either  especially 
effective  or  relatively  mediocre  by  specialists  in  a variety  of  natural  science  and 
engineering  fields  were  investigated. 

Doctoral  dissertations  selected  by  the  candidates'  major  advisers  would 
provide  complete  reports  of  original  research  which  was  the  major  responsibility 
of a single investigator. The major adviser seems to be in an especially good
position to judge the effectiveness of the individual research. However, an investigation
of the university inter-library loan system indicated that the obtaining of a
sufficiently  large  number  of  dissertations  for  the  purposes  of  this  investigation  was 
not  feasible. 

Final  reports  of  research  for  distribution  within  the  producing  laboratory 
or  parent  organization  have  many  of  the  advantages  of  doctoral  dissertations.  Such 
reports  are  usually  quite  complete  and  can  be  judged  by  supervisory  personnel  under 
whom  the  research  was  conducted.  Communication  with  cognizant  persons  in  a 
research  foundation  and  industrial  laboratories  indicated  that  such  reports  are 
usually  highly  confidential  and  not  available  for  evaluation  by  persons  not  within  the 
organization.  The  problem  of  the  security  of  report  content  obtains  to  an  even  greater 
degree for most governmental laboratories. It was pointed out that reports of such
research, i.e., of a non-confidential nature, are usually published either in scientific
journals or as [illegible] reports.

The published literature, particularly the scientific journals, proved to be
the only source for a sufficiently large number of available reports. One limitation
of such reports is the relatively condensed treatment which is required for publication
in most journals. This makes a comprehensive and detailed evaluation of the
research through such reports more difficult than evaluation of more lengthy and
complete research reports.


Planning the Research Design

From  the  above  considerations  the  following  specific  steps  were  planned. 

1.  Obtain  selections  of  a large  number  of  research  reports  in  chemistry, 
physics  and  engineering  which  are  judged  by  highly  competent  persons  in 
the  field  to  represent  either  especially  effective  or  relatively  mediocre 
research.  These  persons  will  be  termed  "selectors." 

2. Have the selected reports evaluated by other workers in the field using the
trial Record Form for Evaluating Research Through the Report. These
persons will be termed "evaluators." They will not be aware of the selectors'
judgments concerning the effectiveness of the research they are evaluating
through the report.

3.  Revise  the  trial  form  on  the  basis  of  results  from  the  evaluations  and  the 
comments  of  evaluators  concerning  the  form.  The  primary  method  of 
revision  will  be  the  selection  of  items  from  the  trial  form  which  show  the 
greatest amount of agreement between the selectors' judgment and the
evaluators' evaluation of the effectiveness of the research being considered.

4. Have each of a number of reports evaluated independently by two persons.
This will permit an estimation of the inter-evaluator reliability of the items
selected for the revised version of the form. The item inter-evaluator reliability
coefficients could also provide a basis for selecting items, although
it was not planned that this be done.
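
The item-selection rule of step 3 and the reliability check of step 4 can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the report does not name the statistics at this point, so the difference in proportions used here as the index of agreement with the selectors, the simple percent-agreement measure, and every name and data value below are hypothetical.

```python
from typing import Dict, List, Tuple

# One (selector_label, marked) pair per evaluation of an item.
# selector_label is "E" (especially effective) or "M" (relatively mediocre);
# marked is True when the evaluator reported that the effective behavior occurred.
Evaluation = Tuple[str, bool]


def discrimination(evals: List[Evaluation]) -> float:
    """Proportion marked among E reports minus proportion marked among M reports."""
    e = [marked for label, marked in evals if label == "E"]
    m = [marked for label, marked in evals if label == "M"]
    if not e or not m:
        raise ValueError("need at least one E and one M evaluation")
    return sum(e) / len(e) - sum(m) / len(m)


def select_items(items: Dict[str, List[Evaluation]], k: int) -> List[str]:
    """Keep the k items whose responses agree best with the selectors' judgments."""
    ranked = sorted(items, key=lambda name: discrimination(items[name]), reverse=True)
    return ranked[:k]


def percent_agreement(pairs: List[Tuple[bool, bool]]) -> float:
    """Share of dually evaluated reports on which both evaluators responded alike."""
    if not pairs:
        raise ValueError("no paired evaluations")
    return sum(a == b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical responses for two items, two E and two M evaluations each.
    items = {
        "item_1": [("E", True), ("E", True), ("M", False), ("M", True)],
        "item_2": [("E", True), ("E", False), ("M", True), ("M", True)],
    }
    print(select_items(items, 1))
    print(percent_agreement([(True, True), (True, False), (False, False), (True, True)]))
```

Ranking by a single index keeps the sketch short; a correlation-type coefficient could be substituted in `discrimination` without changing the rest.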


Chapter  III 


CONDUCTING  THE  RESEARCH 


Obtaining  Selections  of  Reports 

After preliminary investigations had been completed, the first step in
conducting the research was to have highly competent persons in special areas of
physics, chemistry, and engineering select a large number of reports as representing
especially effective or relatively mediocre research. The judgment of
these selectors was the criterion of research excellence with which results on the
trial Record Form were compared. Each selector was requested to nominate one
report of especially effective and one of relatively mediocre research.

The  questionnaire  method  seemed  to  be  the  most  feasible  way  of  obtaining 
report selections from a large number of persons in widely scattered geographical
areas. Biographical data contained in American Men of Science provided a convenient
source for compiling a mailing list.

Requests for selections of reported research were sent to 50 persons
whose names were "starred"* in the 1944 edition and to 511 persons listed in the
1949 edition as fellows, officers, or past officers of a professional society in
their main field of interest. Approximately equal numbers of chemists, physicists,
and engineers were contacted. Two follow-up letters were sent to all but a small
number of persons contacted early in the study. In preliminary mailings various
modifications of the original request and follow-up letters** were tried out, in
order to explore the possibility of increasing the proportion of returns from later
mailings.

In reply to 561 mail requests there were 133 returns, or 24 per cent.
Each  completed  return  represents  two  report  selections,  one  of  especially 
effective  and  one  of  relatively  mediocre  research.  Differences  in  proportion 
of  returns  in  answer  to  various  modifications  of  the  request  and  follow-up  letters 
were  not  appreciable. 
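
The mailing figures can be checked with a few lines of Python. The numbers are those given in the text; the variable names are illustrative only, and the count for the 1944 starred list is derived here as the difference between the 561 requests and the 511 persons drawn from the 1949 edition.

```python
# Figures stated in the text.
requests = 561        # total mail requests
listed_1949 = 511     # fellows, officers, or past officers (1949 edition)
returns = 133         # completed returns

# Persons taken from the 1944 starred list, by subtraction.
starred_1944 = requests - listed_1949

# Proportion of requests that produced a completed return.
return_rate = returns / requests

# Each completed return names one effective and one mediocre report.
selections = 2 * returns

print(starred_1944)
print(round(100 * return_rate))
print(selections)
```

The number of selections computed this way is an upper bound on the reports that could be used, since some references could not be located or were otherwise unusable.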

Returns  were  reviewed,  as  they  came  in  from  selectors,  to  determine 
which  of  the  various  fields  were  represented  by  the  reports,  and  whether  the 
references  could  be  readily  located  in  university  or  technical  libraries.  There 
were  254  available  and  usable  research  reports  selected,  90  in  chemistry,  81  in 
physics,  and  83  in  engineering. 


* Starred  men  are  those  mentioned  most  frequently  by  their  colleagues  as 
outstanding  men  of  science.  Starring  was  discontinued  in  the  1949  edition. 

**  Copies  of  these  materials  are  shown  in  Technical  Appendix  A. 
Obtaining Evaluations of Research Through Reports


The  second  step  in  conducting  the  research  was  to  have  a group  of 
evaluators read the reports chosen by selectors in the previous step and use the
trial  Record  Form  in  evaluating  the  research  through  the  report.  Minimum 
requirements  for  evaluators  were  set  as  follows:  for  physics  and  chemistry, 
a doctor's  degree  or  a master's  degree  plus  two  years'  experience  in  research 
or  teaching;  and  for  engineering,  a master's  degree  or  a bachelor's  degree  plus 
three  years'  experience  in  research  or  teaching. 

Prospective  participants  were  contacted  by  mail  in  various  universities 
and  research  organizations.  Each  potential  evaluator  was  requested  to  indicate 
on  a data  form  his  willingness  to  participate  in  the  study  and  certain  information 
concerning  his  training  and  experience.  From  the  information  given  on  the  data 
form  it  was  usually  possible  to  provide  each  evaluator  with  a report  in  his 
general  area  of  specialization. 

Two  reports  were  usually  referred  to  each  evaluator,  but  never  more 
than  two.  Usually,  each  evaluator  was  sent  one  report  selected  as  an  example 
of  outstandingly  effective  and  one  report  selected  as  an  example  of  relatively 
mediocre research. A few evaluators evaluated either two "especially effective"
or two "relatively mediocre" research studies, and a few evaluated only one study.

The  evaluator,  of  course,  did  not  know  on  what  basis  the  reports  had  been 
selected. 


Two hundred and two reports were evaluated. Of these, 116 were evaluated only
once, while each of the remaining 86 was independently evaluated by two
persons. These independent evaluations were obtained to permit an estimation
of the inter-evaluator reliability of the procedure.


Chapter  IV 


RESEARCH  RESULTS 
Evaluations  of  Research  Through  Reports 

The method of obtaining evaluations was discussed in the previous chapter.
An effort was made to obtain approximately equal numbers of evaluations from the
fields of chemistry, physics, and engineering and to include evaluations of research
from most of the more important specializations within each field. This effort was
made because it was thought that a Record Form revised on the basis of results from
a sample including representative reports from the three disciplines would have
wider applicability than if revised on the basis of results from a more restricted
sample. The number of evaluations obtained for each specialization is shown in
Table I. The dual evaluations of the 43 especially effective and 43 relatively mediocre
reports were used to obtain an estimate of the inter-evaluator reliability of the revised
Record Form. The trial Record Form and instructions for its use are presented
on pages 11-19.


Table I

Reports Evaluated in Physics, Chemistry, and Engineering

                          Evaluated by   Evaluated by   Total Reports   Total Number
                           One Person     Two Persons     Evaluated     of Evaluations
                            E*    M**      E     M        E     M        E     M

Chemistry                   19    21      15    15       34    36       49    51
  Inorganic-Analytical       3     5       4     4        7     9       11    13
  Physical                   4     5       6     4       10     9       16    13
  Organic                    8    11       5     5       13    16       18    21
  Biochemistry               4     0       0     2        4     2        4     4

Physics                     23    20      12    14       35    34       47    48
  Atomic-Nuclear             9     5       5     5       14    10       19    15
  Electricity                7     7       5     6       12    13       17    19
  Mechanics                  1     2       1     0        2     2        3     2
  Spectroscopy               3     2       1     3        4     5        5     8
  Meteorology                0     1       0     0        0     1        0     1
  Sound                      3     3       0     0        3     3        3     3

Engineering                 16    17      16    14       32    31       48    45
  Chemical                   5     4       4     2        9     6       13     8
  Civil                      0     1       0     1        0     2        0     3
  Electrical                 2     5       4     3        6     8       10    11
  Mechanical                 3     2       2     2        5     4        7     6
  Metallurgical              3     3       0     0        3     3        3     3
  Petroleum                  0     0       1     1        1     1        2     2
  Aeronautical              [entries only partly legible: 5, 5, 5, 10, 10]
  Ceramic                   [entries only partly legible: 2, 0, 1, 2, 1]
  Textile                   [entries only partly legible: 1, 1, 2, 2]

All Fields                  58    58      43    43      101   101      144   144

* E indicates  reports  selected  as  representing  especially  effective  research. 
**  M indicates  reports  selected  as  representing  relatively  mediocre  research. 
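
The columns of Table I are tied together by simple arithmetic: the reports evaluated are those evaluated by one person plus those evaluated by two, and each dually evaluated report contributes two evaluations. The Python sketch below checks these relations for the Physics and All Fields rows, whose figures are given in the table. The `Row` type and the function name are illustrative assumptions, not part of the original study.

```python
from typing import Dict, NamedTuple


class Row(NamedTuple):
    one_e: int      # E reports evaluated by one person
    one_m: int      # M reports evaluated by one person
    two_e: int      # E reports evaluated by two persons
    two_m: int      # M reports evaluated by two persons
    reports_e: int  # total E reports evaluated
    reports_m: int  # total M reports evaluated
    evals_e: int    # total evaluations of E reports
    evals_m: int    # total evaluations of M reports


def is_consistent(r: Row) -> bool:
    """True when the totals follow from the one-person and two-person counts."""
    return (
        r.reports_e == r.one_e + r.two_e
        and r.reports_m == r.one_m + r.two_m
        and r.evals_e == r.one_e + 2 * r.two_e
        and r.evals_m == r.one_m + 2 * r.two_m
    )


rows: Dict[str, Row] = {
    "Physics": Row(23, 20, 12, 14, 35, 34, 47, 48),
    "All Fields": Row(58, 58, 43, 43, 101, 101, 144, 144),
}

for name, row in rows.items():
    print(name, is_consistent(row))

# The text reports 202 reports in all, 116 evaluated once and 86 evaluated twice.
total = rows["All Fields"]
print(total.reports_e + total.reports_m)
print(total.one_e + total.one_m, total.two_e + total.two_m)
```

The same relations can be used to cross-check any row of the table in which one or two entries are difficult to read.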
1. Evaluate the research by filling in the Record Form for Evaluating Research Through
the  Report.  It  is  organized  as  an  outline  with  major  and  sub-headings  with  numbered 
categories  below  them.  Each  category  is  composed  of  an  effective  item  to  the  left  and 
an  ineffective  item  to  the  right.  There  is  a blank  space  beside  each  item. 

Place  one  of  the  following  symbols  in  each  blank  space: 

NA  = This  item  could  not  have  occurred  because  such  activity  is 
not  applicable  to  this  research. 

? = This item might have occurred, but the reader cannot tell
from this report.

0 = This item did not occur.

X = This item occurred, but was not at all important to conducting
this research well or poorly.

√ = This item occurred and was of some importance to conducting
this research well or poorly.

IMPORTANT: THE SPACE BESIDE EVERY ITEM SHOULD BE MARKED WITH SOME
SYMBOL. THE TWO ITEMS FOR ANY CATEGORY WILL NOT NECESSARILY BE GIVEN
THE SAME NOR NECESSARILY DIFFERENT SYMBOLS, SINCE EACH ITEM WILL BE
EXAMINED SEPARATELY.

2.  Complete  the  section  on  the  last  page  of  the  Record  Form  which  asks  for  your  over- 
all judgment  of  the  contribution  of  the  research. 

3.  Circle  the  symbol  for  each  item  which  in  itself  made  this  study  of  definite  value  (or 
reduced  the  value  of  this  research  appreciably)  and  had  a noticeable  effect  upon  your 
over-all  judgment  of  the  contribution  of  this  research. 

4.  Of  the  items  circled  under  3 above,  place  an  additional  circle  around  the  symbol  for
each  item  which  in  itself  made  this  study  an  important  contribution  (or  reduced  the 
value  of  this  research  very  significantly)  and  had  a sizeable  effect  upon  your  over-all 
judgment  of  the  importance  of  this  research. 

5.  Report  on  the  last  page  of  the  Record  Form  the  time  required  to  evaluate  the  research. 
This  will  not  include  time  required  to  read  the  report. 
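
The marking scheme in steps 1 through 4 can be pictured as a small data model. The Python sketch below is only an illustration and is not part of the original procedure: the names `Entry`, `BASE_SYMBOLS`, and `importance` are hypothetical, the check mark is spelled `"check"`, and circling is modelled as a count of 0, 1, or 2 circles around a symbol.

```python
from dataclasses import dataclass

# The five symbols of step 1; "check" stands in for the check mark.
BASE_SYMBOLS = ("NA", "?", "0", "X", "check")


@dataclass(frozen=True)
class Entry:
    """One marked blank: a symbol plus the circles added in steps 3 and 4."""
    symbol: str
    circles: int = 0  # 0 = not circled, 1 = step 3, 2 = step 4

    def __post_init__(self):
        if self.symbol not in BASE_SYMBOLS:
            raise ValueError(f"unknown symbol: {self.symbol!r}")
        if self.circles not in (0, 1, 2):
            raise ValueError("an entry carries 0, 1, or 2 circles")


def importance(entry: Entry) -> str:
    """Paraphrase what the circles say about the over-all judgment."""
    return (
        "no noticeable effect recorded",
        "noticeable effect on the over-all judgment",
        "sizeable effect on the over-all judgment",
    )[entry.circles]


# A category is a pair (effective item, ineffective item), marked separately.
category = (Entry("X"), Entry("check", circles=1))
for side, entry in zip(("effective", "ineffective"), category):
    print(side, entry.symbol, "-", importance(entry))
```

Keeping the two items of a category as separate entries mirrors the instruction that each item is examined separately and need not receive the same symbol.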


EXAMPLES  OF  ENTRIES 

1. The effective item is not applicable to this research; the reader cannot tell from the
report whether the ineffective item occurred, which in itself reduced the value of
this research appreciably.

NA  Effective item    Ineffective item  (?)

2. The effective item occurred, but was not at all important to conducting the research
well; the ineffective item occurred and was of some importance to conducting this
research poorly.

X  Effective item    Ineffective item  ✓

3. The effective item occurred and in itself made this study an important contribution;
the ineffective item did not occur.

((✓))  Effective item    Ineffective item  0

4. The effective item occurred and was of some importance to conducting the research
well; the ineffective item occurred and in itself reduced the value of this research
appreciably.

✓  Effective item    Ineffective item  (✓)




RECORD  FORM  FOR  EVALUATING  RESEARCH  THROUGH  THE  REPORT 


Title  of  Report 

Author(s) Evaluated  by. 


Effective 


I.  FORMULATING  PROBLEMS  AND  HYPOTHESES 
A.  Identifying  and  Exploring  Problems 


Ineffective 


1.  Investigated  chance  findings,  unex- 
pected results  or  difficulties  en- 
countered in  work  or  mentioned 
significance  of  such  findings. 

2.  Chose  for  investigation  a problem 
for  which  solution  was  urgently 
needed. 

3.  Suggested  a new  problem  which  could 
be  studied  with  an  already  successful 
technique. 

4.  Proposed  an  entirely  new  problem  or 
line  of  research. 

5.  Used  materials  that  had  recently  been 
made  available  to  study  previously 
unsolved  problem. 

6.  Conducted  preliminary  investigation 
to  see  whether  phenomena  merited  ex- 
perimental study  or  to  furnish 
essential  basic  data. 


Failed  to  investigate  chance  findings, 
unexpected  results  or  difficulties 
encountered  in  work  or  failed  to 
mention  significance  of  such  findings. 
Chose  problem  for  which  solution  was 
not  urgently  needed  although  there 
were  urgent  problems  in  his  research 
area. 

Allowed  a successful  technique  to  be 
dropped  without  further  application 
to  new  problems. 

Worked  on  a problem  which  had  already 

been  solved  or  proved  unproductive. 
Failed  to  use  new  methods  or  mater- 
ials which  were  recently  made  avail- 
able to  study  previously  unsolved 
problems. 

Failed  to  conduct  preliminary  investi- 
gation to  see  whether  phenomena 
merited  experimental  study  or  to 
furnish  essential  data. 


B.  Defining  the  Problem


1. Proposed investigation of basic
factors and implications involved in
the problem as well as its superficial
aspects.

2. Defined the problem and objectives
of investigation.

3. Gathered information on exact
requirements, specifications, and goal
of project.

4. Proposed investigating only factors
which could feasibly be studied under
existing practical limitations.

5. Covered both theoretical and
experimental aspects of problem.


Proposed  an  investigation  confined 
to  superficial  aspects  of  problem. 


Did  not  define  problem  or  objectives 
of  investigation. 

Failed  to  obtain  information  needed 
to  define  the  requirements,  specifi- 
cations, and  goal  of  project. 

Chose  a problem  which  did  not  lend 
itself  to  investigation  because  of 
practical  limitations. 

Ignored either theoretical or
experimental aspects of problem.

Effective  C.  Setting  Up  Hypotheses  Ineffective


1.  Proposed  hypothesis  to  direct  research 
or  to  explain  observed  phenomena. 

2.  Proposed  hypothesis  in  agreement 
with  all  known  facts. 

3.  Predicted  phenomena  by  theoretical 
or  mathematical  analysis. 

4.  Extended  a theory  to  cover  a broader 
range  of  problems. 

5.  Reformulated  a theory  to  improve  its 
explanation  of  the  facts. 

6.  Explained  observed  phenomena  through 
theory  or  analogous  situation  in  the 
same  field  or  a related  field. 


Proposed  program  of  data  collection 
undirected  by  any  hypothesis. 

Proposed  hypothesis  contrary  to 
known  facts. 

Failed  to  predict  phenomena  by  theo- 
retical or  mathematical  analysis  when 
it  was  clearly  possible  to  do  this. 

Failed  to  extend  theory  to  cover 
broader  range  of  problems  when 
possible. 

Failed  to  reformulate  a theory  although 
this  would  clearly  have  improved  its 
explanation  of  the  facts. 

Failed  to  explain  observed  phenomena 
through  theory  or  analogous  situation 
in  the  same  field  or  a related  field 
which  was  clearly  applicable. 


II.  PLANNING  AND  DESIGNING  THE  INVESTIGATION 
A.  Collecting  Background  Information 


1.  Sought  out  information  and  ideas  from 
existing  literature,  associates,  or  ex- 
perts on  problem  before  beginning 
work  on  project. 

2.  Included  all  relevant  sources  in  sur- 
veying the  literature  or  consulting 
experts. 

3.  Questioned  the  validity  of  material  in 
the  literature. 

4.  Obtained  needed  information  from  an 

uncommon  source. 

5.  Suggested  that  literature  he  had  read 

in  the  past  might  apply  to  the  current 
problem. 

6.  Performed  experiments  or  gathered 
necessary  information  directly  which 
was  unavailable  in  usual  sources. 


Did  not  consult  those  intimately  con- 
cerned with  problem,  or  investigate 
existing  literature  before  beginning 
work  on  project. 

Omitted  an  important  source  in  sur- 
veying literature  or  consulting 
experts. 

Took  action  based  on  unreliable 
information  in  the  literature  without 
checking. 

Consulted  only  the  most  common 
sources  for  needed  information. 

Ignored  application  of  literature 
which  he  should  have  read  in  the  past 
to  the  current  problem. 

Failed  to  perform  experiments  or 
gather  necessary  information  directly 
which  was  unavailable  in  usual  sources. 


B.  Setting  Up  Assumptions 


1.  Based  research  plan  on  assumptions 
which  closely  approximated  actual 
conditions. 

2.  Secured  evidence  of  validity  of 
assumptions. 

3.  Verified  previous  work  before  basing 
assumptions  on  it. 


Used  a research  plan  dependent  on  _____ 
false  assumptions  or  assumptions 
inapplicable  to  specific  problem. 

Failed  to  secure  evidence  of  validity 
of  assumptions. 

Based plan of investigation on opinion
or previous work of others without
question.

Effective  C.  Identifying  and  Controlling  Important  Variables  Ineffective


1.  Provided  for  control  and  systematic 
variation  of  all  relevant  variables. 

2.  Made  provision  for  equated  conditions 
in  planning  comparison  tests. 

3.  Simulated  actual  conditions  in  a lab- 
oratory test. 

4.  Treated  the  various  factors  in  accord- 
ance with  their  relative  importance. 

5.  Pointed  out  the  significance  of  a factor 
overlooked  or  dismissed  as  trivial  by 
others. 


Failed  to  provide  for  control  and 
systematic  variation  of  all  relevant 
variables. 

Failed  to  make  provision  for  equated 
conditions  in  planning  comparison 
tests. 

Failed  to  simulate  actual  conditions 
in  a laboratory  test. 

Failed  to  treat  the  various  factors 
in  accordance  with  their  relative 
importance. 

Failed  to  point  out  the  significance 
of  an  important  and  obvious  factor. 


D.  Developing  Systematic  and  Inclusive  Plans 


1.  Included  all  relevant  factors  or  phases 
in  the  investigation. 

2.  Included  methods  of  integrating  one 
factor  or  phase  with  others. 

3.  Pointed  out  the  basic  factors  in  a mass 
of  information  about  the  problem. 

4.  Tried  out  various  approaches  to 
problem  before  choosing  one. 

5.  Studied  each  element  of  problem  in 
proper  sequence. 


Omitted  a relevant  factor  or  included 
an  irrelevant  factor  in  the  investi- 
gation. 

Considered  one  factor  or  phase  in 
isolation  from  related  phases. 

Failed  to  point  out  basic  factors  in 
a mass  of  information  about  the 
problem. 

Did  not  try  out  various  approaches  to 
the  problem  before  choosing  one. 
Studied  elements  of  problem  in 
illogical  sequence. 


E.  Developing  Plans  for  the  Use  of  Equipment,  Materials,  or  Techniques 


1.  Used  equipment,  material,  or  tech- 
niques which  met  the  requirements 
of  the  problem. 

2.  Used  simplified  or  substitute  equip- 
ment, materials,  or  techniques,  which 
met  required  standards  and  saved  time 
or  money. 

3.  Conducted  pilot  study  to  determine 
feasibility  of  proposed  techniques, 
materials,  or  equipment. 

4.  Used  technique  or  equipment  which 
would  eliminate  doubt  of  validity  or 
accuracy  of  results. 

5.  Used  the  latest  development  of  an 
appropriate  equipment,  technique,  or 
material. 

6.  Set  up  work  for  procedure  in  most 
efficient  physical  arrangement  for 
handling  details  easily. 


Used  equipment,  materials,  or  tech- 
niques not  fitted  to  the  requirements 
of  the  problem. 

Used  equipment  or  materials  more 
complex  or  expensive  than  necessary 
to  produce  results  of  required  standards. 

Used  a procedure  that  had  never  been 
tested. 

Used  technique  or  equipment  which 
would  leave  doubt  of  validity  or 
accuracy  of  results. 

Ignored  latest  development  of  an 
appropriate  equipment,  technique,  or 
material. 

Set  up  work  for  procedure  in  an  in- 
efficient physical  arrangement  so  de- 
tails could  not  be  handled  easily. 


Effective  F.  Anticipating  Difficulties  Ineffective


1. Made provision for an alternate
approach or for handling difficulties
which might arise at later stages.

2.  Included  in  plans  internal  or  inde- 
pendent check  on  accuracy  of  data  or 
method. 

3.  Outlined  probable  consequences  of 
various  alternative  approaches. 

4.  Took  special  precautions  in  planning 
to  prevent  damage  to  equipment. 


Made  no  provision  for  handling  dif- 
ficulties which  might  arise  at  later 
stages. 

Failed  to  set  up  internal  or  inde- 
pendent check  on  accuracy  of  data  or 
methods. 

Failed  to  consider  probable  conse- 
quences of  various  alternative 
approaches. 

Failed to take necessary precautions
in planning to prevent damage to
equipment.

G.  Determining  the  Number  of  Observations 


1. Collected an appropriate quantity of
data for the purpose of the investigation.

Collected data which were insufficient
or considerably more than sufficient
in quantity.


III.  CONDUCTING  THE  INVESTIGATION
A.  Developing  Methods,  Materials,  or  Equipment


1.  Devised  an  improved  method,  material, 
or  equipment. 

2.  Developed  an  entirely  new  and  effect- 
ive method,  material,  or  equipment 
to  fill  a need. 

3.  Adapted  available  methods,  materials, 
or  equipment  to  meet  requirements  of 
new  problem. 

4.  Showed  experimentally  the  capabilities 
of  method,  material,  or  equipment  he 
developed. 


Developed  a method,  material,  or 
equipment  which  was  less  effective 
or  no  better  than  existing  one. 
Developed  a new  method,  material, 
or  equipment  which  did  not  meet 
assigned  specifications  or  recognized 
need. 

Failed  to  make  proper  adaptation  of  ex- 
isting method,  material,  or  equipment 
to  permit  its  use  in  a specific  problem. 
Failed  to  show  experimentally  the  cap- 
abilities of  method,  material,  or 
equipment  he  developed. 


Methods  and  Techniques 


1. Applied critical tests of equipment or
material correctly.

2. Used a technique, material, or equipment
which solved problem or eliminated
difficulty in the investigation.

3. Tried even unlikely methods after
obvious methods had failed.

4. Demonstrated that material, technique,
or equipment could be used for purposes
other than original ones.

5. Used most accurate method of
measurement that was available.

6. Applied all methods provided for in
plans.


Used equipment, material, or technique
incorrectly.

Failed to use a technique, material, or
equipment which would have solved
problem or eliminated difficulty in the
investigation.

Failed to try unlikely methods after
obvious methods had failed.

Failed to suggest that material, technique,
or equipment could be used for
purposes other than original ones.

Estimated data when accurate methods
of measurement were available.

Failed to apply a method provided for
in plans.

Effective


C.  Modifying  Planned  Procedures 


Ineffective 


1. Modified standards, methods, or work
schedule to meet practical demands
without reducing essential value of
results.

2. Adopted alternate procedures as soon
as unforeseen negative conditions or
difficulties were encountered.

3. Held up phases of work until results
of earlier phase were available.

4. Accepted partial results or temporary
solution to an urgent problem.

5. Modified work to incorporate latest
research findings.

6. Instituted a change which prevented
damage, incorrect performance, or
inaccuracy.

7. Did not abandon or modify a method or
device until sufficient evidence had
been gathered.

8. Modified all of the materials or
procedures giving trouble.


Failed to modify standards, methods,
or work schedule to meet practical
demands or omitted an essential step or
precaution for accuracy in modifying
them.

Continued to follow old method or
approach without change when evidence
showed it had failed.

Began work on a phase of project
without waiting for results of earlier phase.

Refused to accept partial results or
temporary solution to an urgent problem.

Did not modify work to incorporate
latest research findings.

Instituted a change causing damage,
incorrect performance, or inaccuracy.

Abandoned or modified a method or
device before sufficient evidence had
been gathered.

Modified only a part of the materials
or procedures giving trouble.


D.  Applying  Theory


1. Presented unique solution or technique
developed by mathematical analysis.

2. Transformed physical problem so that
it could be solved by mathematical
analysis.

3. Explained phenomenon by analyzing
procedures used or data obtained.

4. Solved a problem by application or
extension of textbook principles.

5. Was able to provide answer to
technical question, and gave no incorrect
information.

6. Correctly interpreted implications of
fundamental theory in explaining
application to a problem.


Presented an erroneous mathematical
solution.

Failed to transform physical problem
so that it could be solved by
mathematical analysis when such
transformation was possible.

Failed to explain phenomenon by
analyzing procedures used or data obtained.

Failed to solve a problem requiring
only direct application or elementary
extension of textbook principles.

Failed to provide answer to technical
question or gave incorrect technical
information.

Omitted or misinterpreted implications
of fundamental theory in explaining
application to a problem.


E.  Attending  To  and  Checking  Details


1. Performed work which met standards
for accuracy.

2. Gave proper proportion of time and
attention to small details of procedure.


Performed work which contained errors
and did not meet standards for accuracy.

Gave disproportionate time and
attention to small details of procedure.


Effective  F.  Analyzing  the  Data  Ineffective


1. Used the most efficient method for
analyzing data.

2. Used data analysis method which was
well suited to give required information.

3. Completed only analyses necessary
for data obtained.

4. Made all necessary mathematical
analyses of data.


Used inefficient method for
analyzing data.

Used a data analysis method which
could not give required information.

Completed an analysis unnecessary
for data obtained.

Failed to make the necessary
mathematical analysis of data.

IV.  INTERPRETING  RESEARCH  RESULTS
A.  Evaluating  Findings


1. Produced conclusions or recommendations
supported by data and interpreting
results correctly.

2. Drew conclusions in accordance with
correct logical principles.

3. Presented results of a check on validity
of conclusions.

4. Pointed out the limitations of data or
method, conflicting elements, and
conclusiveness of evidence.

5. Presented logical explanation of
unexpected results.

6. Drew all conclusions from the data that
were justifiable and showed solution to
problem.

7. Drew conclusions only from complete,
adequate, and correct data.


Drew conclusions not supported by
the data and interpreting results
incorrectly.

Presented a conclusion violating
logical principles.

Did not present a check on validity
of conclusions.

Did not point out the limitations of
data or method, conflicting elements,
and conclusiveness of evidence.

Reported unexpected results without
logical explanation.

Failed to draw conclusions from data
or to show solution to problem.

Drew conclusions from incomplete,
inadequate, or erroneous data.


B.  Pointing  Out  Implications  of  Data


1. Pointed out new and useful implications
and possible extensions of work.

2. Worked out applications to other
problems or fields.


Failed to report implications and
possible extensions of work or reported
inapplicable ones.

Included inadequate discussion of
applications which should have been
discussed fully.


V.  PREPARING  REPORTS 
A.  Describing  and  Illustrating  Work 


1.  Included  important  details  of  proced- 
ure and  results  sufficient  for  checking 
or  repetition  of  work. 

2.  Used  graphic,  tabular,  or  pictorial
material  to  clarify  text. 

3.  Defined  all  terms  and  symbols. 

4.  Used  simple,  direct  language,  concrete 
words,  and  correct  English  usage. 


Failed  to  include  important  details 
of  procedure  and  results  sufficient  for 
checking  or  repetition  of  work. 

Omitted  necessary  illustrative  material. 

Presented  ambiguous  definition  of  terms 
or  symbols  or  failed  to  define  them. 
Made excessive use of complex sentence
structure and unusual words or
violated correct English usage.

Effective

5.  Kept  statement  of  problem  and  con- 
clusions brief  enough  for  quick  grasp. 

6.  Explained  new  material  when  first 
introduced. 

7.  Gave  examples  of  practical  applic- 
ations or  a simplified  statement  of 
complex  theory. 

8.  Used  a simple  system  of  designation 

for  items. 

9.  Defined  purpose  of  report  or  project 
explicitly. 

10.  Gave  proper  emphasis  to  major  and 
unimportant  findings. 

11.  Limited  treatment  of  elementary  theory 
or  well  known  materials  to  a brief  dis- 
cussion and  gave  only  necessary  detail 
in  descriptions. 

12.  Used  only  accurate  labels  on  illus- 
trative material,  accurate  references, 
and  correct  symbols. 


Ineffective 

Made  statement  of  problem  and  conclu- 
sions unnecessarily  long  and  involved. 
Failed  to  explain  or  define  new 
material. 

Failed  to  give  examples  of  practical  ___ 
applications  or  a simplified  statement 
of  complex  theory. 

Used  an  unnecessarily  complex  system  _ 
of  designation  for  items. 

Failed  to  define  purpose  of  report  or 
project  explicitly. 

Failed  to  emphasize  major  findings 
or  overemphasized  unimportant  findings. 
Discussed  at  length  elementary  theory 
or  well  known  materials  or  gave  ex- 
cessive detail  in  descriptions. 


Used inaccurate labels on illustrative
material, inaccurate references, or
incorrect symbols.


B.  Substantiating  Procedures  and  Findings


1. Described fully the basic principles
involved.

2. Gave explicit statement of underlying
assumptions and inferences.

3. Included detailed reasoning leading
to conclusions presented.

4. Gave an especially complete, relevant
bibliography.

5. Presented proof for unusual theory.

6. Gave derivations of all but very common
equations or formulas.


Failed to give sufficient background or
theory for full understanding.

Failed to state underlying assumptions.

Failed to present material on which
conclusions were based.

Omitted necessary bibliography or
failed to give full reference to
related work area.

Gave abstruse theory without
presenting proof.

Failed to give derivations of equations
or formulas.

C.  Organizing  the  Report


1. Gave problem and introductory material
at the beginning of the report.

2. Summarized the important points.

3. Followed a logical outline.

4. Placed lengthy or detailed analysis or
data in appendix.

5. Separated background material or
discussion from presentation of method.

6. Placed references in appropriate
location.

7. Presented figures or tables in order
corresponding to text.

8. Used a logical order in tabulating
data and presenting conclusions.


Failed to give plan and scope of the
problem at the beginning of the report.

Failed to bring out the main points.

Separated related sections or materials
or jumped from one point to
another.

Gave technical details in body of
report.

Mixed background material or
discussion with presentation of method.

Placed references in an inappropriate
location.

Presented figures or tables in order
not corresponding to text.

Used an illogical order in tabulating
data or presenting conclusions.


D.  Using  Appropriate  Style  in  Presenting  Report


1. Used a style adapted to probable
readers.

2. Heightened interest and stimulated
thought by skillful manner of
presentation.


Used an unduly informal style or style
inappropriate for readers.

Presented material in an uninteresting
and unstimulating way.

* * *


We  would  like  you  to  make  an  over-all  judgment  about  the  research  you  have  just  evalu- 
ated. In  making  this  judgment  it  might  be  helpful  to  consider  whether  you  think  the  time 
and  money  required  for  the  project  were  well  spent,  whether  the  research  made  a signi- 
ficant contribution  to  knowledge  in  the  field,  whether  findings  of  this  study  were  sub- 
stantiated by  later  research,  and  whether  you  would  recommend  that  other  persons  read 
this  report.  It  is  especially  important  to  keep  in  mind  that  negative  results  do  not  neces- 
sarily mean  the  research  made  no  significant  contribution  to  knowledge  in  the  field. 
Taking  these  points  into  consideration,  please  check  the  one  statement  below  which  you 
consider  to  be  most  true.  These  statements  should  be  considered  definitions  of  five 
points  on  a continuum. 

This  research  made  an  especially  significant  contribution  to 

knowledge  in  the  field.  I would  strongly  recommend  that  all 
persons  in  the  field  read  this  report. 

This  research  made  a relatively  significant  contribution  to 

knowledge  in  the  field.  I would  recommend  that  interested 
persons  in  the  field  read  this  report. 

This  research  made  a small  but  definite  contribution  to  know- 
ledge in  the  field.  I would  suggest  that  interested  persons  in 
the  field  might  find  this  report  of  some  value. 

This  research  contributed  almost  nothing  to  knowledge  in  the 

field.  I would  recommend  to  few  persons  in  the  field  that  they 
read  this  report. 

This  research  contributed  nothing  to  knowledge  in  the  field  and 

probably  has  misled  some  workers  in  the  field.  I would  never 
recommend  that  other  persons  in  the  field  read  this  report. 


You  should  now  circle  important  items  as  de- 
scribed under  3 and  4 of  the  instructions. 


How much time did you spend completing the Record Form (including circling of
important items, but not including time spent in reading the report)?

Comments: (e.g., important factors not covered or difficulties in using the procedure)
(Use back of sheet if necessary)


Results  on  the  Trial  Record  Form 


The  frequency  with  which  each  of  the  seven  response  symbols  (defined  in 
the  Instructions  for  Evaluators)  was  used  or  a blank  space  was  left  was  tabulated 
for  each  of  the  228  items,  114  effective  and  114  ineffective.  This  was  done  separ- 
ately for  the  144  evaluations  of  reports  selected  as  examples  of  especially  effective 
research  and  the  144  evaluations  of  reports  selected  as  examples  of  relatively 
mediocre  research.  Tabulations  for  each  item  were  also  made  separately  for  physics, 
chemistry,  and  engineering  reports.  Results  for  the  three  disciplines  were  suffic- 
iently similar  that  the  data  were  combined  for  further  analysis. 

Table  II  shows  the  relative  frequency  with  which  each  symbol  was  used  for 
all  items  for  reports  selected  as  representing  especially  effective  and  relatively 
mediocre  research.  Results  are  shown  separately  for  effective  and  ineffective  items. 

Table II

Frequency With Which the Various Symbols
Were Used on the Trial Record Form


                   Effective Items                      Ineffective Items
           Effective        Mediocre           Effective        Mediocre
           Research         Research           Research         Research
Symbol*  Number Percent   Number Percent     Number Percent   Number Percent

((✓))       399    2.4       224    1.4          38    0.2        83    0.5
(✓)        1099    6.7       946    5.8         117    0.7       190    1.2
✓          6978   42.5      6027   36.7         514    3.1       725    4.4
X          1288    7.8      1424    8.7         632    3.8       634    3.9
0          2763   16.8      3468   21.1       11516   70.1     10739   65.4
?          1604    9.8      1804   11.0        1503    9.2      1685   10.3
NA         2268   13.8      2498   15.2        1949   11.9      2223   13.5
Blank        17    0.1        25    0.2         147    0.9       137

* Definitions of Symbols

((✓)) (check-double circle) = The item occurred and in itself made the study an
important contribution (or reduced the value of the research very significantly).

(✓) (check-circle) = The item occurred and in itself made the study of definite
value (or reduced the value of the research appreciably).

✓ = The item occurred and was of some importance to conducting the research
well or poorly.

X = The item occurred but was not at all important to conducting the research
well or poorly.

0 = The item did not occur.

? = The item might have occurred, but the reader cannot tell from the report.

NA = The item could not have occurred because such activity is not applicable to
the research.

Blank = The evaluator failed to place one of the above symbols in the space beside
the item.


It may be seen from Table II that effective items were checked somewhat
more frequently as having occurred for reports selected as representing especially
effective research whereas ineffective items were checked somewhat more
frequently as having occurred for reports selected as representing relatively
mediocre research. It may also be noted that, for all reports, the ineffective
items were checked much less frequently than the effective items. These same
trends hold for the "check-circle" and "check-double circle" responses. (These
symbols are explained in the footnote of Table II.)
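
The percentages in Table II follow directly from the counts. As a rough check, the Python sketch below recomputes one column (effective items, reports selected as effective) from the counts given in the table. The dictionary keys and variable names are illustrative labels chosen here, not terms fixed by the report.

```python
# Counts copied from Table II: effective items, reports selected as effective.
counts = {
    "check-double circle": 399,
    "check-circle": 1099,
    "check": 6978,
    "X": 1288,
    "0": 2763,
    "?": 1604,
    "NA": 2268,
    "Blank": 17,
}

ITEMS = 114        # effective items on the trial form
EVALUATIONS = 144  # evaluations of reports selected as effective

total = sum(counts.values())
# Every space is either marked with a symbol or left blank, so the counts
# must cover every effective item on every evaluation.
assert total == ITEMS * EVALUATIONS

percents = {symbol: round(100 * n / total, 1) for symbol, n in counts.items()}
for symbol, pct in percents.items():
    print(f"{symbol:20s} {counts[symbol]:6d} {pct:5.1f}")
```

The same check can be repeated for the other three columns, each of which should also total 114 items times 144 evaluations.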

On the last page of the trial Record Form evaluators were asked to make
an over-all judgment about the research they had evaluated on the 228 items on
the form. The results of these judgments are shown separately in Figure 1 for
reports selected as especially effective and relatively mediocre.

It  may  be  seen  from  Figure  1 that  research  which  was  selected  as  especially 
effective  tended  to  be  judged,  through  the  report,  to  have  made  a greater  contri- 
bution to  knowledge  in  the  field  than  research  selected  as  relatively  mediocre. 

There were sometimes, however, wide discrepancies between selectors' and
evaluators' judgments. It may be noted, for example, that in eight cases where
the research had been selected as relatively mediocre it was judged to have made
an especially significant contribution to knowledge in the field and that in two cases
where the research was selected as especially effective it was judged to have
contributed nothing to knowledge in the field. The amount of agreement between selectors'
and evaluators' judgments may be expressed by a point-biserial correlation
coefficient of +.36 (N = 287).
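
For readers unfamiliar with the statistic, a point-biserial coefficient relates a two-valued variable (here, selected as effective or as mediocre) to a graded one (here, the evaluator's over-all judgment). The Python sketch below shows one common way to compute it. The function name, the scoring of the five statements as 5 down to 1, and the eight ratings are all invented for illustration; they are not data or procedures taken from the study.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(groups, scores):
    """Point-biserial r between 0/1 group flags and numeric scores.

    Uses r = (M1 - M0) / s * sqrt(p * q), where s is the standard
    deviation of all scores (population form), p is the proportion of
    cases in group 1, and q = 1 - p.
    """
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    if not ones or not zeros:
        raise ValueError("both groups must be non-empty")
    spread = pstdev(scores)
    if spread == 0:
        raise ValueError("scores must vary")
    p = len(ones) / len(scores)
    return (mean(ones) - mean(zeros)) / spread * sqrt(p * (1 - p))


# Hypothetical data: 1 = selected as effective, 0 = selected as mediocre;
# judgments scored 5 (most favorable statement) down to 1 (least favorable).
selected = [1, 1, 1, 1, 0, 0, 0, 0]
judgment = [5, 4, 4, 3, 4, 3, 2, 3]
print(round(point_biserial(selected, judgment), 2))
```

The coefficient is numerically the same as an ordinary product-moment correlation computed with the two-valued variable coded 0 and 1.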

In Figure 1, as with Table II, it may be seen that there is a very strong
tendency for reports of both effective and mediocre research to be evaluated more
favorably than unfavorably. That is, effective items were checked as having
occurred far more frequently than ineffective items (Table II), and far more research
was judged to have made a significant contribution to knowledge in the field than was
judged to have made little or no contribution. This is to be expected since even
the relatively mediocre research was, usually, of sufficient value to be accepted
for publication in a professional journal. It could be expected, therefore, that the
relatively mediocre research in this sample would not typify the least productive
research conducted. It would rather be expected to typify some of the least
productive research which is published in the literature. It seems likely that, on the
average, the less productive research remains unpublished.

It  has  previously  been  mentioned  that  each  of  86  research  studies  was  in- 
dependently evaluated  by  two  persons.  Of  the  pair  of  evaluations  for  each  report  one 
was  randomly  designated  as  evaluation  A and  the  other  as  evaluation  B.  For  pur- 
poses of  the  present  analysis  there  were  85  pairs  of  evaluations  since  one  evaluator 
failed  to  make  an  over-all  judgment  about  the  contribution  of  the  research.  Figure  2 
shows  the  extent  of  agreement  between  evaluation  A and  evaluation  B on  over-all 
judgments  about  the  contribution  of  the  research. 


Figure  1 


Distribution  of  Over-all  Judgments 
About  the  Value  of  the  Research 


The  Roman  numerals  represent  judgments  about  the  research  as  defined  below. 
Arabic  numerals  indicate  the  number  of  evaluators  making  each  judgment. 


[Bar chart. Left panel: Reports Selected as Representing Effective Research
(axis 50% to 0). Right panel: *Reports Selected as Representing Mediocre
Research (axis 0 to 50%).]


I  This  research  made  an  especially  significant  contribution  to  knowledge  in  the 
field.  I would  strongly  recommend  that  all  persons  in  the  field  read  this  report. 

II  This  research  made  a relatively  significant  contribution  to  knowledge  in  the  field. 
I would  recommend  that  interested  persons  in  the  field  read  this  report. 

III  This research made a small but definite contribution to knowledge in the field.
I would suggest that interested persons in the field might find this report of some
value.

IV  This  research  contributed  almost  nothing  to  knowledge  in  the  field.  I would 
recommend  to  few  persons  in  the  field  that  they  read  this  report. 

V  This  research  contributed  nothing  to  knowledge  in  the  field  and  probably  has  mis- 
led some  workers  in  the  field.  I would  never  recommend  that  other  persons  in 
the  field  read  this  report. 


* One evaluator of a report representing mediocre research did not make an
over-all judgment, making the total 143.


Figure 2


Agreement  Between  Independent  Judgments 
Of  the  Contribution  Made  by  Research 

Roman  numerals  indicate  judgments  as  defined  in  Figure  1.  Arabic  numerals  inside 
the  small  boxes  indicate  the  number  of  reports  with  particular  combinations  of  over- 
all judgments  by  the  A and  B evaluators.  Arabic  numerals  to  the  right  of  the  large 
square  indicate  the  number  of  reports  given  each  type  of  judgment  by  B evaluators, 
and  those  under  the  large  square  indicate  the  number  given  each  type  of  judgment  by 
A evaluators.  Frequencies  inside  the  heavy  boxes  represent  perfect  agreement 
between  evaluators.  Those  in  light  boxes  indicate  less  than  perfect  agreement.  The 
farther  light  boxes  are  from  the  nearest  heavy  black  box,  the  greater  the  extent  of 
disagreement.  The  frequency  expected  by  chance  in  each  box  is  indicated  in 
parentheses in the lower right hand corner of the box.

It may be seen from Figure 2 that agreement in over-all judgment between
two independent evaluations of research through the report is substantially better
than chance, but is certainly far from perfect. The extent of agreement is roughly
the same as that between evaluators' over-all judgment and selectors' designation
as especially effective or relatively mediocre research. The data shown in
Figure 2 yield a product-moment correlation coefficient of +.35. The point-
biserial correlation between the over-all judgment and the selectors' effective-
mediocre judgment was +.31 for the 85 A evaluators and +.27 for the 85 B evaluators.
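
The report does not state how the chance frequencies in Figure 2 were obtained. A common approach, assumed here, is to multiply the row total by the column total for each box and divide by the number of reports. The sketch below shows that computation; the function name and the small sample table are hypothetical and are not the Figure 2 data.

```python
def expected_frequencies(table):
    """Chance frequency for each cell: (row total x column total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical 2 x 2 table: rows are B evaluators' judgments,
# columns are A evaluators' judgments.
observed = [[10, 20],
            [30, 40]]
expected = expected_frequencies(observed)

# Perfect agreement falls on the diagonal (the "heavy boxes").
observed_agreement = sum(observed[i][i] for i in range(len(observed)))
chance_agreement = sum(expected[i][i] for i in range(len(expected)))
```

Comparing the observed diagonal total with the chance diagonal total gives a rough indication of how far agreement exceeds chance.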


Evaluators'  Comments 

About  half  of  the  evaluators  made  one  comment  or  more  about  the  trial 
Record  Form.  Each  comment  was  listed  separately  and  an  attempt  was  made  to 
classify all comments into relatively homogeneous areas. The main and sub-headings
of classification are shown below:*

Evaluators'  Comments  Concerning  the  Trial 
Record  Form  for  Evaluating  Research 
Through  the  Report 

(Number in parentheses after a given sub-heading indicates the number of evaluators
making comments in this area.)

I.  General  Comments 

A.  Record  Form  is  well  done.  (5) 

B.  Record  Form  is  not  well  conceived.  (2) 

C.  Not  completing  form  because  of  its  length  or  complexity.  (11) 

D.  Evaluation  procedure  is  subjective.  (4) 

E.  This  type  of  evaluation  requires  very  specialized  knowledge.  (3) 

F.  Form  is  too  long.  (12) 

G.  Procedure  not  applicable  to  the  type  of  paper  being  evaluated.  (49) 

II.  Evaluator's  Task 

A.  Instructions  are  not  clear.  (6) 

B.  Form  should  be  scored  twice  (tentatively  the  first  time).  (2) 

C.  Not  enough  room  is  left  on  forms  for  author  and  title.  (2) 

III.  The  Items 

A.  Items  are  not  relevant  to  useful  evaluation.  (6) 

B.  Designation  of  item  as  effective  or  ineffective  is  wrong  or  confusing.  (9) 

C.  It  is  difficult  or  impossible  to  answer  certain  of  the  items.  (10) 

D.  Statement  of  items  is  ambiguous  or  incomplete.  (26) 

E.  Items  overlap.  (5) 


* This  outline  is  reproduced  in  the  technical  appendices  to  this  report  with  examples 
under  each  sub-area. 



F.  Procedure  needs  additional  items.  (7) 

G.  Items  are  in  the  wrong  place.  (1) 

IV.  The  Symbol  Scoring  System 

A.  Scoring  system  should  provide  for  degrees  of  effectiveness  and  in- 
effectiveness. (8) 

B.  Symbol  system  is  ambiguous  or  inadequate.  (5) 

C.  Additional  symbols  are  needed.  (6) 

D.  It  is  difficult  to  decide  which  symbol  should  be  used.  (11) 

The  large  majority  of  critical  comments  seemed  to  stem  from  two  related 
sources,  the  length  and  complexity  of  the  trial  form  and  the  inappropriateness  of 
the  form  for  the  type  of  report  being  evaluated.  The  most  common  single  comment 
dealt  with  the  point  that  published  reports,  and  especially  journal  articles,  must 
of  necessity  be  greatly  condensed.  Many  of  the  detailed  questions  posed  by  the 
trial  Record  Form  concerning  the  research  could  not,  consequently,  be  answered 
adequately  from  the  report.  This  problem  and  the  necessity  for  using  published 
research  were  discussed  in  Chapter  II.  The  length  and  complexity  of  the  trial 
Record  Form  result  from  the  developmental  nature  of  the  present  research.  A 
large  number  of  items  was  included  on  the  trial  form  to  provide  empirical  evidence 
for  selecting  only  the  better  items  for  the  revised  Record  Form. 

It  is  thought  that  both  of  the  major  points  criticized  on  the  trial  form  -- 
length  and  inappropriateness  of  many  items  for  journal  articles  --  have  been 
eliminated  to  a large  extent  in  the  revised  form,  since  the  aim  of  the  revision  was 
to  select  from  the  trial  form  only  those  items  which  functioned  adequately  in  the 
present  study.  The  revised  Record  Form  is  much  shortened  and  composed  only 
of  the  most  cogent  items  from  the  trial  Record  Form.  The  comments  and  suggestions 
of  evaluators,  as  well  as  the  statistical  and  other  considerations  discussed  in  the 
following  section,  were  carefully  considered  in  revising  the  Record  Form. 


Item  Analysis  and  Revision  of  the 
Trial  Record  Form 

For purposes of the item analysis the eight possible responses to each
item were grouped into two classes -- checks, including the three check symbols,
and non-checks, including Blank, NA, ?, 0, and X. The number of times symbols in
each class were used for each item was tabulated separately for research selected
as especially effective and research selected as relatively mediocre. A phi (φ)
coefficient of correlation was computed for each item between the type of selection
(as especially effective or relatively mediocre) and the "check-non-check variable."
The computation of a phi coefficient from these data for one effective item is
illustrated in Table III.

Table III*

Relationship Between Selectors' Judgments of Research Effectiveness
and Evaluators' Responses to Item I-A-2 (Effective)

                                               Number of reports
                                        "Especially    "Relatively
                                         Effective"     Mediocre"      Totals

Checked (the three check symbols)        82 (a)         60 (b)         142 (a+b)

Not checked (Blank, NA, ?, 0, and X)     62 (c)         84 (d)         146 (c+d)

Totals                                   144 (a+c)      144 (b+d)      288 (a+b+c+d)

$$\phi = \frac{(ad) - (bc)}{\sqrt{(a+b)(c+d)(a+c)(b+d)}} = \frac{(6888) - (3720)}{\sqrt{(142)(146)(144)(144)}}$$
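
The Table III computation can be checked with a short sketch. The function name is hypothetical; the cell counts are those given in Table III for item I-A-2, and the `reflect` helper follows the report's convention of changing the sign for ineffective items.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table with cells
    a = checked, effective      b = checked, mediocre
    c = not checked, effective  d = not checked, mediocre
    """
    numerator = a * d - b * c
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

def reflect(phi):
    """Sign change applied to ineffective items, so that a positive
    value always means the item functions in the desired manner."""
    return -phi

# Cell counts from Table III (item I-A-2, an effective item)
phi = phi_coefficient(82, 60, 62, 84)
print(round(phi, 2))  # 0.15
```

With these counts the numerator is 6888 - 3720 = 3168, and the coefficient is about +.15, which is above the +.08 level used later for retaining items.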


The  obtained  values  for  ineffective  items  were  reflected  (plus  changed  to 
minus  and  minus  changed  to  plus)  so  a positive  phi  coefficient  would  indicate  an 
item,  whether  effective  or  ineffective,  functioning  in  the  desired  manner.  The 
distribution  of  phi  coefficients  obtained  for  the  114  effective  items  on  the  trial 
Record  Form  is  shown  in  Figure  3.  The  distribution  of  phi  coefficients  obtained 
for  the  114  ineffective  items  is  shown  in  Figure  4. 

Figures 3 and 4 show that, on the average, both effective items and ineffective
items discriminate between reports of especially effective research and reports
of relatively mediocre research. A comparison of Figures 3 and 4 reveals that
effective items, in general, discriminate to a greater degree than ineffective items.


* The greater the excess of checks for effective reports over those for mediocre
reports the higher positive will be the phi coefficient. The greater the excess
of checks for mediocre reports over those for effective reports the higher
negative will be the phi coefficient. The limiting values of phi are plus and minus
one.

Figure 3

Distribution of Validity (φ) Coefficients Obtained for the
114 Effective Items on the Trial Record Form

[Histogram; vertical axis: No. of φ Coefficients]


The  mean  phi  coefficients  for  the  effective  and  for  the  ineffective  items  in 
each  of  the  five  main  performance  areas  on  the  trial  form  were  computed.  The 
results are shown below:

                                                  Mean Validity for   Mean Validity for
Area                                              Effective Items     Ineffective Items*

I.   Formulating Problems and Hypotheses                .11                 .06
II.  Planning and Designing the Investigation           .07                 .05
III. Conducting the Investigation                       .09                 .03
IV.  Interpreting Research Results                      .09                 .10
V.   Preparing Reports                                  .08                 .03

* Item validities reflected.


The mean validity coefficients for the various areas seem to be fairly
similar. A further study of area scores and interrelationships would have been
desirable, but was beyond the scope of the present study.

The  obtained  phi  coefficients  provided  one  basis  for  selecting  items  for  the 
revised  Record  Form.  Another  consideration  in  selecting  items  was  the  extent  to 
which  each  item  correlated  with  other  items  selected  for  the  revised  Record  Form. 
Other  things  being  equal  the  higher  the  correlation  of  an  item  with  other  selected 
items  the  less  will  be  its  contribution  to  the  validity  of  the  selected  items.  Items 
with  low  positive  or  negative  validity  would  not  contribute  much  in  any  case,  unless 
they  were  given  special  weights  which  did  not  seem  justified  in  this  study.  There- 
fore, all  items  with  a phi  below  +.08  were  excluded  from  further  analysis. 

There were 65 effective and 34 ineffective items with phi's of +.08 or
above. A score based on these items was obtained for each of the 288 trial Record
Forms completed in the tryout of the form. The total score was obtained by counting
each check (including all three check symbols) of an effective item as +1 and each check
of an ineffective item as -1. The mean number of effective items checked was 35
and the mean number of ineffective items checked was two. Thus, the mean
total score was +33. The standard deviation for effective items was 12.7, for
ineffective items was 3.7, and for the total score was 14.6. The product-moment
correlation between the number of effective items checked and the number of
ineffective items checked was -.39. The product-moment correlation between the
number of effective items checked and the total score was +.96 and the product-
moment correlation between the number of ineffective items checked and the total
score was -.58.
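
Because the total score is the number of effective items checked minus the number of ineffective items checked, its standard deviation and its correlations with the two part scores follow from the reported standard deviations (12.7 and 3.7) and the correlation of -.39 between the parts. The sketch below is a consistency check under that assumption, not a reproduction of the original computation; the function name is hypothetical.

```python
from math import sqrt

def difference_score_stats(sd_e, sd_i, r_ei):
    """Statistics of total = effective count - ineffective count."""
    cov = r_ei * sd_e * sd_i                       # covariance of the two counts
    sd_total = sqrt(sd_e ** 2 + sd_i ** 2 - 2 * cov)
    r_e_total = (sd_e ** 2 - cov) / (sd_e * sd_total)
    r_i_total = (cov - sd_i ** 2) / (sd_i * sd_total)
    return sd_total, r_e_total, r_i_total

# Values reported in the text
sd_total, r_e_total, r_i_total = difference_score_stats(12.7, 3.7, -0.39)
```

The derived standard deviation of the total score is about 14.5, the correlation of the effective count with the total is about +.97, and the correlation of the ineffective count with the total is negative, about -.59. These are close to the reported figures; the small differences may reflect rounding or grouping in the original computations.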

The point biserial correlation of each item with the total score was obtained.
This provided an estimate of the average overlap of each item entering into the total
score with all other items in the total score. This correlation was spurious to a
slight extent since each item for which the correlation was obtained was also
included in the total score. The range of point biserial coefficients for the 65 effective
items with the total score was from +.11 to +.60 with a median of +.44. The range
of point biserial coefficients for the 34 ineffective items with the total score was from
-.11 to -.40 with a median of -.26.


The point biserial correlation between total score for the 99 items (65
effective and 34 ineffective) and the effective-mediocre criterion was obtained.*
A weight was computed for each item, showing the relative contribution of the
item in predicting the criterion in conjunction with the total score. This index
was computed from the formula:**


$$\beta_{ci.t} = \frac{\phi_{ic} - (r_{ct})(r_{it})}{1 - (r_{it})^2}$$

where $\beta_{ci.t}$ is the beta weight for a given item (i) when the criterion is
predicted by item and total score,

$\phi_{ic}$ is the phi coefficient of correlation*** between criterion and given
item (i),

$r_{ct}$ is the point biserial correlation between criterion and total score (constant
for all items), and

$r_{it}$ is the point biserial correlation*** between total score and a given
item (i).


It  may  be  seen  from  this  formula  that  the  weight  for  a given  item  depends 
upon  both  its  correlation  with  the  criterion  and  with  a composite  of  other  items. 

For effective items the greater the validity phi and the smaller the item-composite
correlation the larger the beta weight becomes.  For  ineffective  items  this  is  also
true  but  since  both  validity  phi  and  item-total  score  correlation  are  negative,  the 
beta  weight  will  also  have  a negative  sign.  That  is,  the  size  of  the  negative  beta 
becomes  larger  as  the  negative  validity  coefficient  increases  and  the  item-total 
score  correlation  approaches  zero.  We  would,  of  course,  expect  that  the  occurrence 
of  an  ineffective  item  of  behavior  should  have  negative  weight  in  the  evaluation  of 
research. 
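
The behavior of the weight described above can be illustrated with a brief sketch of the usual three-variable beta weight. The function name is hypothetical. The criterion-total correlation of .30 and the median item-total correlations of .44 and -.26 are taken from the text; the item validities of +.15 and -.10 are assumed values chosen only for illustration.

```python
def beta_weight(phi_ic, r_ct, r_it):
    """Beta weight of an item when the criterion is predicted from
    the item and the total score together.

    phi_ic: correlation between criterion and item (unreflected)
    r_ct:   correlation between criterion and total score
    r_it:   correlation between item and total score (unreflected)
    """
    return (phi_ic - r_ct * r_it) / (1 - r_it ** 2)

# Effective item: assumed validity +.15, item-total correlation +.44
effective_beta = beta_weight(0.15, 0.30, 0.44)

# Ineffective item: assumed validity -.10, item-total correlation -.26
ineffective_beta = beta_weight(-0.10, 0.30, -0.26)

# Same validity, smaller overlap with the total score: larger weight
less_overlap_beta = beta_weight(0.15, 0.30, 0.20)
```

In this illustration the effective item receives a small positive weight, the ineffective item a negative weight, and reducing the item-total correlation while holding validity constant increases the weight.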


* This coefficient was +.30, but this cannot be considered an estimate of the total
score validity since individual items were selected for this score on the basis
of their validities.

**  This  is  an  application  of  the  usual  formula  for  a beta  weight  in  the  three 
variable  problem. 

***  The  signs  of  these  coefficients  were  not  reflected  for  purposes  of  this  com- 
putation. 

The obtained beta weight, comments of evaluators, the number of times the
item was checked, and a review of each item in terms of whether it would logically
be expected to function in the desired manner were all considered in selecting items
for the revised Record Form. The first consideration was whether both effective
and ineffective items should be selected for the revised form or whether it would be
sufficient to use only one type of item. The relatively small number of times
ineffective items were checked in comparison with effective items suggested that the
ineffective items might make little additional contribution to the effective items.

It was found that the point biserial correlation between total score for the
65 effective items with a validity phi of +.08 or greater and the effective-mediocre
criterion was increased only +.02 by adding the score for the 34 ineffective items
to the score for effective items using the beta weights* for combining the scores.
It was therefore concluded that it was probably not necessary to select ineffective
items for the revised Record Form.

On the basis of the considerations mentioned above, 50 of the 65 trial
Record Form effective items with a validity phi of +.08 or greater were selected
for the revised Record Form. Four of the 50 items were revised slightly, largely
on the basis of suggestions made by evaluators.


Reliability  of  the  Revised  Record  Form 

Since each of 86 reports was evaluated independently by two persons, it
was possible to estimate the inter-evaluator reliability of a score obtained from the
50 selected items. The report on which one evaluator did not make an over-all
judgment was not included in this analysis so results would be directly comparable
with those for the inter-evaluator reliability of the over-all judgment. The product-
moment correlation between the two independent evaluations of the 85 reports on the
50 selected items was +.16. This may be compared with the product-moment
correlation coefficient of +.35 between two independent over-all judgments of the
same reports by the same 85 pairs of evaluators.

The per cent of agreement between the evaluators on each of the 50 items
was computed. The per cent of agreement expected by chance** was subtracted


* The beta weight for the effective score was +.25 and the beta weight for the
ineffective score was .12.


** The expected proportion of agreement for each item was computed from the
formula

$$p_A p_B + q_A q_B = \text{expected proportion of agreement}$$

where

$p_A$ is the proportion of A evaluators checking the item (one of each pair
of evaluators evaluating a given report was randomly designated
A, the other was designated B),

$p_B$ is the proportion of B evaluators checking the item,

$q_A$ is 1.00 - $p_A$, and

$q_B$ is 1.00 - $p_B$.

from this obtained per cent of agreement to yield the per cent difference from
chance agreement. The tally of results for the 50 items is shown below:


Per Cent Difference
From Chance Agreement        Number of Items

   +.18 to +.20                     3
   +.15 to +.17                     3
   +.12 to +.14                     1
   +.09 to +.11                     6
   +.06 to +.08                     5
   +.03 to +.05                    16
    .00 to +.02                    12
   -.03 to -.01                     4

   Total                           50
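
The difference from chance agreement for a single item can be sketched as follows. The chance term is the formula given in the footnote (the product of the two checking proportions plus the product of the two non-checking proportions); the function names and the sample responses are hypothetical.

```python
def chance_agreement(p_a, p_b):
    """Expected proportion of agreement: p_A p_B + q_A q_B."""
    return p_a * p_b + (1 - p_a) * (1 - p_b)

def difference_from_chance(checks_a, checks_b):
    """Observed minus chance proportion of agreement for one item.

    checks_a, checks_b: 1 if the A (or B) evaluator of a report
    checked the item, 0 otherwise; one entry per report.
    """
    n = len(checks_a)
    observed = sum(1 for a, b in zip(checks_a, checks_b) if a == b) / n
    p_a = sum(checks_a) / n
    p_b = sum(checks_b) / n
    return observed - chance_agreement(p_a, p_b)

# Hypothetical responses of A and B evaluators to one item on four reports
item_a = [1, 1, 0, 0]
item_b = [1, 0, 0, 0]
diff = difference_from_chance(item_a, item_b)  # 0.75 observed - 0.50 chance
```

Values near zero, as found for most of the 50 items, indicate that the two evaluators of a report agreed on the item little more often than would be expected by chance.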


The extent of agreement between evaluators' over-all judgment and their
responses may be indicated by a product-moment correlation between their total
score on the 50 items and their over-all judgment. For the 85 A evaluators this
correlation was +.46 and for the 85 B evaluators it was +.36.


An estimate of the internal consistency of the composite of 50 items was
obtained for both the A evaluators of the 85 reports and for the B evaluators. The
Kuder-Richardson Formula (20) estimate for the A evaluators was .88 and for the
B evaluators was .89. It may be seen from these estimates that the evaluators
show very substantial self-consistency in checking the 50 items. This indicates
that different items are measuring essentially the same thing and hence duplicating
each other in the function measured. However, the very low inter-evaluator
reliability would indicate that standards for checking items vary considerably from
one evaluator to another.
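
For reference, a Kuder-Richardson Formula 20 estimate can be sketched as below. This is the standard form of the formula and assumes the variance of total scores is computed with divisor n; the report does not say which divisor was used. The function name and the sample matrix are hypothetical.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for 0/1 item responses.

    responses: one row per completed form, one 0/1 entry per item
    (1 = item checked, 0 = item not checked).
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum over items of p * q, the item variances
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical responses: four forms, three items
forms = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
estimate = kr20(forms)
```

A high value reflects consistency among items within each evaluator's responses. It says nothing about agreement between evaluators, which is why a KR-20 estimate near .9 can coexist with an inter-evaluator correlation of +.16.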

The  revised  Record  Form  for  Evaluating  Research  Through  the  Report  is 
shown  on  pages  32-35. 


RECORD  FORM  FOR  EVALUATING  RESEARCH  THROUGH  THE  REPORT 


Title  of  Report 

Author  (s) 

Evaluated by

INSTRUCTIONS 

1.  Evaluate  the  research  by  completing  this  form.  It  is  organized  as  an  outline  with 
headings  and  numbered  items  below  them.  Each  item  describes  a specific  area  of 
effective  research  performance.  There  is  a blank  space  beside  each  item. 

Place  one  of  the  following  symbols  in  each  blank  space  to  indicate  whether 
there  is  evidence  in  the  report  that  effective  performance  occurred  in  the  area 
described: 

NA = This item could not have occurred because such activity is
not applicable to this research.

? = This item might have occurred, but the reader cannot tell
from this report.

0 = This item did not occur.

X = This item occurred, but was not at all important to the
effective conduct of this research.

✓ = This item occurred and was of some importance to the
effective conduct of this research.

IMPORTANT:  THE  SPACE  BESIDE  EVERY  ITEM  SHOULD 

BE  MARKED  WITH  SOME  SYMBOL. 

2. Complete the section on the last page of the Record Form which asks for your
over-all judgment of the contribution of the research.

3. Place a circle around the symbol for each item which in itself made this study an
important contribution (or reduced the value of this research very significantly)
and had a sizeable effect upon your over-all judgment of the importance of this
research. Items marked ? or 0, as well as checked items, may be circled since
inadequate reporting or lack of effective performance may significantly reduce
the value of research.


EXAMPLES  OF  ENTRIES 

1. (✓) The item occurred and in itself made the study an important
contribution.

2. (?) It cannot be told from the report whether the item occurred;
this in itself reduced the value of the research very significantly.

3. (0) The item did not occur; this in itself reduced the value of the
research very significantly.


I.  FORMULATING PROBLEMS AND HYPOTHESES


1.  Chose  for  investigation  a problem  for  which  solution  would  be  a valuable 
contribution. 

2.  Proposed  an  entirely  new  problem  or  line  of  research. 

3.  Used  materials  that  had  recently  been  made  available  to  study  previously 
unsolved  problem. 

4.  Conducted  preliminary  investigation  to  see  whether  phenomena  merited 
experimental  study  or  to  furnish  essential  basic  data. 

5. Proposed investigation of basic factors and implications involved in the
problem as well as its superficial aspects.

6. Gathered information on exact requirements, specifications, and goal of
project.

7.  Covered  both  theoretical  and  experimental  aspects  of  problem. 

8.  Proposed  hypothesis  to  direct  research  or  to  explain  observed  phenomena. 

9.  Proposed  hypothesis  in  agreement  with  all  known  facts. 

10.  Predicted  phenomena  by  theoretical  or  mathematical  analysis. 

11.  Explained  observed  phenomena  through  theory  or  analogous  situation  in  the 
same  field  or  a related  field. 


II.  PLANNING  AND  DESIGNING  THE  INVESTIGATION 

12.  Included  all  relevant  sources  in  surveying  the  literature  or  consulting 
experts. 

13.  Performed  experiments  or  gathered  necessary  information  directly  which 
was  not  available  in  usual  sources. 

14.  Based  research  plan  on  assumptions  which  closely  approximated  actual 
conditions. 

15.  Secured  evidence  of  validity  of  assumptions. 

16.  Provided  for  control  and  systematic  variation  of  all  relevant  variables. 

17.  Treated  the  various  factors  in  accordance  with  their  relative  importance. 

18.  Included  all  relevant  factors  or  phases  in  the  investigation. 

19.  Included  methods  of  integrating  one  factor  or  phase  with  others. 

20.  Pointed  out  the  basic  factors  in  a mass  of  information  about  the  problem. 

21.  Tried  out  various  approaches  to  problem  before  choosing  one. 

22. Made provision for an alternate approach or for handling difficulties which
might arise at later stages.

23.  Used  equipment,  material,  or  techniques  which  met  the  requirements  of  the 
problem. 

24.  Used  technique  or  equipment  which  eliminated  doubt  of  validity  or  accuracy 
of  results. 

25. Used the latest development of an appropriate equipment, technique, or
material.

III.  CONDUCTING THE INVESTIGATION

26.  Devised  an  improved  method,  material,  or  equipment. 

27.  Developed  an  entirely  new  and  effective  method,  material,  or  equipment 
to  fill  a need. 

28.  Adapted  available  methods,  materials,  or  equipment  to  meet  require- 
ments of  new  problem. 

29.  Showed  experimentally  the  capabilities  of  method,  material,  or  equip- 
ment he  developed. 

30.  Used  a technique,  material,  or  equipment  which  solved  problem  or 
eliminated  difficulty  in  the  investigation. 

31.  Modified  work  to  incorporate  latest  research  findings. 

32.  Presented  unique  solution  or  technique  developed  by  mathematical  analysis. 

33.  Transformed  physical  problem  so  that  it  could  be  solved  by  mathematical 
analysis. 

34.  Correctly  interpreted  implications  of  fundamental  theory  in  explaining 
application  to  a problem. 

35.  Performed  work  which  met  standards  for  accuracy. 

36.  Used  data  analysis  method  which  was  well  suited  to  give  required 
information. 

37.  Completed  only  analyses  necessary  for  data  obtained. 

38.  Made  all  necessary  mathematical  analyses  of  data. 

IV.  INTERPRETING  RESEARCH  RESULTS 

39.  Presented  results  of  a check  on  validity  of  conclusions. 

40.  Drew  all  conclusions  from  the  data  that  were  justifiable. 

41.  Drew  conclusions  only  from  complete,  adequate,  and  correct  data. 

42.  Pointed  out  new  and  useful  implications  and  possible  extensions  of  work. 

V.  PREPARING  REPORTS 

43.  Used  graphic,  tabular,  or  pictorial  material  to  clarify  text. 

44.  Explained  new  material  when  first  introduced. 

45.  Gave  examples  of  practical  applications  or  a simplified  statement  of 
complex  theory. 

46.  Followed  a logical  outline. 

47.  Placed  references  in  appropriate  location. 

48.  Presented  figures  or  tables  in  order  corresponding  to  text. 

49.  Heightened  interest  and  stimulated  thought  by  skillful  manner  of 
presentation. 

50. Used a style adapted to probable readers.


We would like you to make an over-all judgment about the research you have just
evaluated.  In  making  this  judgment  it  might  be  helpful  to  consider  whether  you  think 
the  time  and  money  required  for  the  project  were  well  spent,  whether  the  research 
made  a significant  contribution  to  knowledge  in  the  field,  whether  findings  of  this 
study  were  substantiated  by  later  research,  and  whether  you  would  recommend 
that  other  persons  read  this  report.  It  is  especially  important  to  keep  in  mind 
that  negative  results  do  not  necessarily  mean  the  research  made  no  significant 
contribution  to  knowledge  in  the  field.  Taking  these  points  into  consideration, 
please  check  the  one  statement  below  which  you  consider  to  be  most  true.  These 
statements  should  be  considered  definitions  of  five  points  on  a continuum. 

This  research  made  an  especially  significant  contribution  to 

knowledge  in  the  field.  I would  strongly  recommend  that  all 
persons  in  the  field  read  this  report. 

This  research  made  a relatively  significant  contribution  to 

knowledge  in  the  field.  I would  recommend  that  interested 
persons  in  the  field  read  this  report. 

This  research  made  a small  but  definite  contribution  to 

knowledge  in  the  field.  I would  suggest  that  interested 
persons  in  the  field  might  find  this  report  of  some  value. 

This  research  contributed  almost  nothing  to  knowledge  in 

the  field.  I would  recommend  to  few  persons  in  the  field 
that  they  read  this  report. 

This  research  contributed  nothing  to  knowledge  in  the 

field and probably has misled some workers in the field.

I would  never  recommend  that  other  persons  in  the  field 
read  this  report. 


You  should  now  circle  important  items 
as  described  under  3 of  the  instructions. 


Comments:


Chapter V


CONCLUSION 

Conclusions  and  Recommendations 


1. The Record Form for Evaluating Research Through the Report which has been
developed  provides  an  outline  of  important  points  for  consideration  in  evaluating 
research  through  the  report.  The  items  tried  out  in  this  study  were  taken 
directly  from  critical  behaviors  suggested  by  a large  number  of  senior  research 
workers.  Only  those  items  which  were  found  to  be  valid  in  this  study  were  in- 
cluded on  the  form. 

2.  The  Record  Form  was  developed  in  a manner  which  should  tend  to  maximize 
its  validity,  but  the  form  has  not  been  tried  out.  It  is  suggested  that  a full- 
scale  tryout  of  the  form  for  various  research  functions  and  scientific  and 
engineering  disciplines  is  desirable. 

3. The comments of evaluators concerning the trial Record Form suggest that the
estimate of inter-evaluator reliability obtained in this study for the 50 selected
items may be low since the large number of items on the trial form probably
tended to reduce the care with which individual items were answered. It is
expected that evaluators using the shorter revised Record Form will be able
to exert greater care in responding to individual items. Nevertheless, the
evidence obtained indicates that scores on the 50 selected items are not very
reliable from one evaluator to another. Although agreement between independent
over-all judgments of the value of research is only moderate, it does appear to
be greater than agreement between scores obtained from checking the occurrence
of behaviors. It is entirely possible that consideration of a number of
behavioral items prior to making the over-all judgment might improve the reliability
and validity of the over-all judgment, but no data are available to test this
possibility. It is therefore recommended that:

a.  checking  of  behaviors  be  used  at  this  time  only  as  an  aid  to  arriving 
at  a careful  over-all  judgment  and  that  ordinarily  no  total  score  be 
obtained  for  comparison  of  various  reports  evaluated  by  different 
evaluators. There are, however, two circumstances in which the
obtaining of total scores might be justified. They are:

(1) when each evaluator completes a Record Form for a number of
different reports, making it possible to convert each total score
into a rank or standard score from the evaluator's distribution
of evaluations, and

(2)  when  each  report  is  evaluated  by  a number  of  persons,  making 
it  possible  to  obtain  an  average  of  total  scores  for  each  report. 


b. Whenever possible, each report should be evaluated by more than one
person. The expected effect of such multiple evaluations upon the
inter-evaluator reliability of the item total scores and over-all judgments, and
the expected effect upon the validity of over-all judgments, are indicated in
Table IV. It should be pointed out that it is much less difficult to obtain
multiple evaluations of research through the report than to obtain evaluations
of research through direct observation, since the report can be evaluated
by a number of competent persons at different times and in scattered
geographical locations.


Table IV

Estimates of Reliability and Validity for Evaluations Obtained
by Combining Varying Numbers of Individual Evaluations

                     Obtained     Estimated Reliability*     Obtained     Estimated Validity**
                    Reliability                              Validity
                       r_xx      n=3   n=5   n=10  n=20        r_xy      n=3   n=5   n=10  n=20

Over-all judgment      .35       .62   .73   .84   .92         .36       .48   .52   .56   .58
Total item score       .16       .36   .49   .66   .79          -         -     -     -     -

*  From the Spearman-Brown formula
** From the following formula:

$$ r_{(nx)y} = \frac{r_{xy}}{\sqrt{\dfrac{1 - r_{xx}}{n} + r_{xx}}} $$

where  x          is an individual evaluation,
       r_{xy}     is the obtained correlation between X (n=1) and Y,
       r_{xx}     is the obtained inter-evaluator reliability of X,
       n          is the number of evaluations of a given report,
       nx         is a combination of n independent evaluations, and
       r_{(nx)y}  is the predicted correlation between a criterion, Y (selectors'
                  judgment), and an average of n evaluations.


c.  Methods  for  reducing  the  variation  in  evaluators'  standards  for  checking 
behaviors  should  be  investigated.  It  is  suggested  that  both  the  effect  of 
using  different  item  forms  and  the  possibility  of  obtaining  area  judgments 
in  addition  to  the  over-all  judgment  might  be  fruitfully  investigated. 
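
The estimates in Table IV can be checked with a short calculation. The sketch
below is illustrative only and is not part of the original study: the function
names are hypothetical, the Spearman-Brown formula is used in its standard form,
and the validity formula is the one given in the second footnote to Table IV.
The input values (.35, .16, and .36) are the obtained coefficients from the table.

```python
from math import sqrt


def spearman_brown(r_xx: float, n: int) -> float:
    """Estimated reliability of a combination of n evaluations,
    given the obtained single-evaluation reliability r_xx
    (standard Spearman-Brown formula)."""
    return n * r_xx / (1 + (n - 1) * r_xx)


def pooled_validity(r_xy: float, r_xx: float, n: int) -> float:
    """Estimated correlation between the criterion Y and an average of
    n evaluations, per the second footnote formula of Table IV."""
    return r_xy / sqrt((1 - r_xx) / n + r_xx)


if __name__ == "__main__":
    # Obtained coefficients taken from Table IV.
    overall_rxx, item_rxx, overall_rxy = 0.35, 0.16, 0.36
    for n in (3, 5, 10, 20):
        print(
            f"n={n:2d}  "
            f"over-all reliability={spearman_brown(overall_rxx, n):.2f}  "
            f"item-score reliability={spearman_brown(item_rxx, n):.2f}  "
            f"over-all validity={pooled_validity(overall_rxy, overall_rxx, n):.2f}"
        )
```

With n = 1 both functions return the obtained coefficients unchanged, which is
a quick check that the formulas are entered correctly.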


Significance of the Research Findings


One  finding  of  this  study  seems  to  be  of  greater  significance  than  all  others. 
This  is  the  consistently  small  amount  of  agreement  between  two  independent 
evaluations  of  a given  piece  of  research  through  the  report.  Agreement  between 
selectors'  and  evaluators'  over-all  judgments  concerning  the  research,  between 
evaluators'  individual  item  responses  and  the  selectors'  judgments,  and  between 
two  evaluators'  independent  item  responses  for  the  same  report  all  tend  to  be  small. 


We might suggest at least four possible explanations for this slight
agreement:

1. There is no way in which research can be evaluated with satisfactory
independent agreement among different qualified evaluators.

2. There are ways in which research may be evaluated with satisfactory
independent agreement among different qualified evaluators, but there can
be no satisfactory method for evaluating research through the report.

3. There are methods for evaluating research through the report which
would provide satisfactory independent agreement among qualified
evaluators, and these are known clearly to research workers in physics,
chemistry, and engineering, but they are not known to the personnel research
workers conducting this study.

4. There could be methods for evaluating research through the report which
would provide satisfactory agreement among qualified persons evaluating the
research independently, but the basic knowledge concerning the complex
operations involved in satisfactory evaluations is not possessed either by
personnel research workers or by research workers in the fields of physics,
chemistry, and engineering.

If  we  accept  the  first  explanation  above,  we  are  forced  to  posit  that  the 
operations  which,  in  sequence,  constitute  a piece  of  research  are  not  amenable  to 
rational-quantitative  measurement.  Otherwise,  there  is  no  reason  why  two  equally 
competent  measurers  should  not  arrive  at  similar  measures  on  the  appropriate 
dimensions.  If  research  is  not  amenable  to  measurement  (evaluation)  it  then 
follows  that  it  differs  qualitatively  from  the  many  other  human  activities  which 
have been successfully measured. It would certainly be agreed that research is a
very complex activity, but this complexity should only make measurement more
difficult, not impossible.

Acceptance of the second explanation would certainly be a major indictment
of the way in which research reports are now written. It is commonly accepted that
research reports are supposed to communicate why the investigation was undertaken,
how it was conducted, what the results were, and what the investigator concluded
from the results. If this information is not available in the printed reports, it
would seem proper to question the value of mass publication of such reports. Of
course, as has previously been mentioned, not all information relevant to a given
piece of research can be published, particularly in abbreviated treatments such as
journal articles. It would seem then that certain aspects of research can be
evaluated only through direct observation and not through the report. But this is no
adequate justification for a belief that the aspects of research which are routinely
reported in published articles are not subject to evaluation.

Our  experience  in  this  study  does  not  lend  support  to  the  third  explanation. 
Interviews  with  workers  in  various  scientific  and  engineering  fields  during  early 
phases  of  the  study  and  comments  of  participating  evaluators  revealed  wide  variation 
in  suggested  methodology.  There  do  not  seem  to  be  any  standards  by  which  different 
research  workers  consistently  evaluate  research  through  the  report.  Emphasis 
upon  one  or  another  aspect  of  research  for  evaluation  seems  to  depend  to  a large 
extent  upon  the  individual  worker. 

The last suggested explanation seems to come closest to an adequate explanation
of why agreement among working researchers was so small. The dimensions on
which  a research  report  can  be  evaluated  with  a sufficient  amount  of  agreement 
between  independent  evaluators  have  not  yet  been  isolated  and  defined.  That  this  is 
true  should  not  be  surprising  for,  as  has  been  pointed  out,  research  is  an  extremely 
complex  activity.  Since  careful  empirical  studies  of  evaluation  of  research  through 
the  report  have  not  been  conducted,  we  cannot  expect  the  answer  to  such  a complex 
problem  to  come  with  little  or  no  effort  to  all  who  desire  it.  We  can  expect  that  it 
will  be  necessary  to  study  the  problem  intensively  with  the  aid  of  rigorous  logical 
and  mathematical  principles  before  anything  approaching  a satisfactory  solution  can 
be  obtained. 

It  is  little  wonder,  then,  that  standards  for  evaluating  research  through  the 
report  are  now  implicit,  highly  individual,  and  obscure.  No  doubt  there  are  those 
who  will  be  content  to  accept  on  faith  that  the  many  research  reports  to  which  each 
individual  research  worker  is  exposed  are  evaluated  by  him  in  a manner  satisfactory 
for  his  own  purposes,  even  though  his  evaluation  is  not  in  agreement  with  other 
workers in the field. It is the opinion of the present investigators, however, that
an empirical demonstration by the rational-quantitative approach is preferable
to unsupported intuitive judgment.

Certainly,  the  crucial  position  held  in  our  society  by  research  workers  and 
the  importance  to  their  work  of  effective  communication  among  them  should  compel 
careful  consideration  of  the  problem. 